Abstract

The G-protein-coupled receptor (GPCR) signaling system is one of the main signaling pathways in eukaryotes. Here, we analyze the 
evolutionary history of all its components, from receptors to regulators, to gain a broad picture of its system-level evolution. Using 
eukaryotic genomes covering most lineages sampled to date, we find that the various components of the GPCR signaling pathway 
evolved independently, highlighting the modular nature of this system. Our data show that some GPCR families, G proteins, and 
regulators of G proteins expanded through lineage-specific diversifications and recurrent domain shuffling. Moreover, most of the
gene families involved in the GPCR signaling system were already present in the last common ancestor of eukaryotes. Furthermore, 
we show that the unicellular ancestor of Metazoa already had most of the cytoplasmic components of the GPCR signaling system, 
including, remarkably, all the G protein alpha subunits, which are typical of metazoans. Thus, we show how the transition to 
multicellularity involved conservation of the signal transduction machinery, as well as a burst of receptor diversification to cope
with the new multicellular necessities. 
Introduction

A molecular system to receive and transduce signals from the 
environment or from other cells is key to multicellular organ- 
isms (Gerhart 1999; Pires-daSilva and Sommer 2003), 
although molecular signaling pathways are not uniquely 
required within a multicellular context. Unicellular species 
face similar signaling needs as multicellular organisms, dealing 
with a changing environment and, in some cases, coordinat- 
ing different cells (e.g., density sensing) (Crespi 2001; King 
2004; Rokas 2008). 

Both animals (metazoans) and plants have evolved complex 
signaling pathways to govern their embryonic development, 
and, according to current genomic data, some of these path- 
ways appear to be specific to either metazoans or plants. This 
is the case of the metazoan-specific WNT and Hedgehog sig- 
naling pathways (Ingham et al. 2011; Niehrs 2012) and the 
land plant-specific auxin and cytokinin (Rensing et al. 2008). 
Other signaling pathways, such as the metazoan Notch
pathway, have instead been assembled from various, more
ancient components by domain shuffling (Gazave et al. 
2009). Finally, other signaling pathways were already present 
in the unicellular ancestors and were subsequently co-opted 
for multicellular functions. Good examples are the receptor
tyrosine kinases, which emerged and expanded in unicellular
holozoans (i.e., choanoflagellates and filastereans), and were
later recruited for developmental control in metazoans (King
et al. 2008; Manning et al. 2008; Suga et al. 2012). The reuse
of previously assembled signaling systems is indeed an impor- 
tant mechanism of signaling pathway co-option in multicellu- 
lar lineages (King et al. 2008). 

One of the major eukaryotic signaling pathways comprises the
G-protein-coupled receptors (GPCRs) and their associated signaling
modules (Fritz-Laylin et al. 2010; Anantharaman et al.
2011; Krishnan et al. 2012), which are conserved from excavates
to animals. GPCRs are involved in many processes apart
from developmental control, such as cell growth, migration,
density sensing, or neurotransmission (Bockaert and Pin 1999;
Pierce et al. 2002; Rosenbaum et al. 2009). GPCRs are able to
sense a wide diversity of signals, including proteins, nucleo- 
tides, ions, and photons. Structurally, GPCRs have a seven 
transmembrane (TM) domain (they are also known as 7TM 
receptors), which forms a ligand-binding pocket in the extra- 
cellular region, and a cytoplasmic G-protein-interacting 
domain (Pierce et al. 2002; Lagerstrom and Schioth 2008), 
which binds to G proteins to mediate intracellular signaling. 
G proteins form a heterotrimeric complex that is disassembled
when activated by the GPCR, which acts as a guanine nucleotide
exchange factor (GEF), and transduce the signal to downstream
effectors (Oldham and Hamm 2008). The G protein
heterotrimeric complex has three different subunits of distinct
evolutionary origin: alpha, beta, and gamma. G protein
heterotrimeric signaling is, in turn, regulated by various protein
families, including Regulators of G protein Signaling (RGS) and
GoLoco-motif-containing proteins (Pierce et al. 2002; 
Siderovski and Willard 2005; Wilkie and Kinch 2005). The 
combination of GPCR, G proteins, and their regulators results 
in many diverse signaling outputs. 

Besides the classic GPCR-G protein signaling system described
earlier, there are alternative upstream and downstream
molecules (fig. 1). For instance, seven TM receptors
associated with RGS domains antagonize "self-activated" G alpha
proteins in some lineages, acting as GTPase-accelerating protein
(GAP) receptors (Urano et al. 2012; Bradford et al. 2013). In
plants, a single pass TM receptor has been recently character- 
ized to interact with G alpha proteins (Bommert et al. 2013). 
Moreover, monomeric G protein alpha activation by Ric8
(resistance to inhibitors of cholinesterase 8) is also GPCR inde- 
pendent (Wilkie and Kinch 2005; Hinrichs et al. 2012), and 
beta/gamma heterodimers are regulated via phosducins 
(Willardson and Howlett 2007). Complementarily, GPCRs 
can perform downstream signaling independently of G pro- 
teins by G protein-coupled receptor kinases (GRKs), Arrestins, 
and Arrestin domain-containing proteins (ARDCs) (Gurevich
VV and Gurevich EV 2006; Reiter and Lefkowitz 2006;
DeWire et al. 2007; Liggett 2011; Shenoy and Lefkowitz 
2011). 

Most of the proteins involved in the GPCR signaling path- 
way have previously been analyzed as single units in various 
phylogenetic contexts (Blaauw et al. 2003; Fredriksson and 
Schioth 2005; Alvarez 2008; Oka et al. 2009; 
Anantharaman et al. 2011; Krishnan et al. 2012; Mushegian
et al. 2012; Bradford et al. 2013). However, not much atten- 
tion has been paid to the system-level evolution of the entire 
pathway, and given the modularity of the system, it is impor- 
tant to investigate its evolution from a global point of view. 

Fig. 1. Schematic representation of the GPCR signaling pathway.
Protein families belonging to similar functional categories are grouped as
specified in the color legend.

In this article, we provide an update on the evolutionary
histories of all components of the GPCR signaling system using
a genomic survey that includes representatives of all eukaryote
supergroups. We analyze the modular structure of the
signaling pathway and show how different parts of the
system coevolved in complementary or independent patterns.
We also reconstruct the GPCR signaling system in the last 
common ancestor of eukaryotes (LECA) and track its evolution 
in various lineages. Finally, we analyze the evolution of the 
system in the transition from unicellular ancestors to metazoans.
We observe strong conservation in the pathway components
associated with cytoplasmic signal transduction,
whereas receptors radiated extensively in metazoans, becoming
one of the largest gene families in metazoan genomes
(Fredriksson and Schioth 2005). The contrast between
the conserved, preadapted signal transduction
machinery and the active diversification of receptors provides
clues on how key innovations in metazoan complexity could 
have evolved from pre-existing machineries. 

Materials and Methods 

Taxon Sampling and Data Gathering 

The 75 publicly available genomes used in this study were
downloaded from databases at the National Center for
Biotechnology Information, the Joint Genome Institute, and
the Broad Institute. Data for some unicellular holozoan species
come from RNAseq data generated in-house (Pirum gemmata,
Abeoforma whisleri, and Corallochytrium limacisporum) or
from the Broad Institute "Origin of Multicellularity
Database" (Ministeria vibrans and Amoebidium parasiticum).
The RNAseq transcripts were translated into six frames. 
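
As an illustration of the six-frame translation step, the following minimal Python sketch translates a transcript in the three forward and three reverse-complement frames. It assumes the standard genetic code (NCBI translation table 1); the function names are hypothetical and this is not the tool actually used in the study.

```python
from itertools import product

# Standard genetic code (NCBI table 1), laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translation(transcript):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = transcript.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

Stop codons are kept as `*` so that downstream domain searches can be restricted to open reading frames if desired.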

All the protein domains that are components of the GPCR 
signaling machinery were selected from the literature and the 
PFAM database (Punta et al. 2012). All proteomes were 
scanned using PfamScan with PFAM A version 26 as query 
and selecting the gathering threshold option. Gathering 
threshold is important in the case of GPCRs, because it 
helps to disambiguate between different GPCR families by 
selecting the most significant hit. Additionally, PfamScan gath- 
ering threshold avoids the spurious partial hits typical of TM 
proteins and is a conservative approach to minimize false pos- 
itives that may arise with other more sensitive methods (Punta 
et al. 2012). General distribution patterns were obtained by 
counting proteins with at least one domain belonging to the 
GPCR signaling machinery present in the PfamScan proteomic 
outputs. The same files were used to obtain multidomain ar- 
chitectures, with the exception of the TM domains analyzed in 
RGS proteins, which were obtained using the TMHMM software
(Krogh et al. 2001). In the case of G protein gamma
subunits, additional TBLASTN searches against reference genomes,
using bikont and opisthokont sequences as queries, were
performed to avoid false negatives. Gene loss is very difficult
to assess due to the different degrees of incompleteness of the
available genomes. To overcome this problem we used, when
possible, more than one taxon for each eukaryotic clade.
Transcriptome data cannot be used to infer gene loss, as genes
can be missed due to low expression, but in our data set most 
species with transcriptomic data have sister species with 
genome sequence available. 
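
The counting and architecture steps described above can be sketched as follows. This is a hedged illustration, not the scripts used in the study: it assumes the whitespace-separated PfamScan table layout (sequence id in column 1, envelope start in column 4, HMM name in column 7), and the domain set, function names, and example rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical subset of Pfam domain names treated as GPCR signaling components.
GPCR_DOMAINS = {"7tm_3", "RGS", "G-alpha"}


def parse_pfamscan(lines):
    """Return {protein_id: [(envelope_start, domain_name), ...]}."""
    hits = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        protein_id, env_start, domain = fields[0], int(fields[3]), fields[6]
        hits[protein_id].append((env_start, domain))
    return hits


def count_component_proteins(hits, domains=GPCR_DOMAINS):
    """Count proteins carrying at least one copy of each domain of interest."""
    counts = defaultdict(int)
    for protein_hits in hits.values():
        for domain in {d for _, d in protein_hits if d in domains}:
            counts[domain] += 1
    return dict(counts)


def architectures(hits):
    """Return the N- to C-terminal domain order of each protein as a string."""
    return {pid: "-".join(d for _, d in sorted(h)) for pid, h in hits.items()}


# Illustrative rows only; these are not results from the study.
example_rows = [
    "# seq_id aln_start aln_end env_start env_end hmm_acc hmm_name ...",
    "protA 300 540 295 545 PF00003.17 7tm_3 Family 1 238 238 210.5 1.2e-62 1 CL0192",
    "protA 20 250 15 255 PF01094.23 ANF_receptor Family 1 343 343 150.1 3.0e-44 1 CL0144",
    "protB 10 120 8 125 PF00615.14 RGS Domain 1 118 118 95.0 2.0e-27 1 No_clan",
    "protB 200 320 198 322 PF00615.14 RGS Domain 1 118 118 80.0 5.0e-23 1 No_clan",
]
hits = parse_pfamscan(example_rows)
counts = count_component_proteins(hits)
archs = architectures(hits)
```

Counting each protein once per domain, regardless of how many copies it carries, matches the "at least one domain" criterion stated in the text.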

Heatmaps, Principal Component Analysis, and Parsimony 
Reconstruction 

Heatmaps were built using the R heatmap.2 function from the
gplots package. Principal component analysis (PCA) was carried
out using the built-in R prcomp function, with scaling and
a covariance matrix, and the results were plotted using the R bpca
package. We assumed Dollo parsimony to infer ancestral gains and
secondary losses for the reconstructions in figure 2 using Mesquite
(Maddison WP and Maddison DR 2011).
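
To make the Dollo assumption concrete, the following Python sketch reconstructs a single gain and the implied secondary losses for one presence/absence character on a rooted tree. It is a simplified illustration, not Mesquite's implementation; the nested-tuple tree encoding, the function names, and the toy taxa are assumptions made for this example.

```python
def leaves(node):
    """Return the set of taxon names under a node (a name or a tuple of nodes)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def dollo(tree, present):
    """Return (gain_clade, lost_clades) for one binary character.

    Under Dollo parsimony the character is gained once, at the most recent
    common ancestor of all taxa that have it, and every absence below that
    node is explained by secondary loss.
    """
    present = set(present) & leaves(tree)
    if not present:
        return None, []
    # Descend while only one child lineage contains taxa with the character.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if leaves(child) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    losses = []

    def walk(n):
        # Record the largest subclades in which the character is entirely absent.
        if isinstance(n, str):
            return
        for child in n:
            if leaves(child) & present:
                walk(child)
            else:
                losses.append(child)

    walk(node)
    return node, losses


# Toy tree with hypothetical taxa; not the taxon sampling of the study.
toy_tree = ((("taxonA", "taxonB"), "taxonC"), ("taxonD", "taxonE"))
gain, losses = dollo(toy_tree, {"taxonA", "taxonC", "taxonE"})
```

In this toy case the character maps to the root and two independent losses are inferred (taxonB and taxonD), which is the pattern interpreted in the text as an ancestral family that was secondarily lost.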

Phylogenetic Analyses 

Arrestins/ARDCs, Ric8, G alpha subunit, G beta subunit, 
Phosducin, Kinase, and RGS domains were used for phylogenetic
analyses. The alignments were obtained using MAFFT
with the L-INS-i option (Katoh and Standley 2013), and
these alignments were manually trimmed to remove ambiguously
aligned regions. Seed alignments are available in supplementary file 1,
Supplementary Material online. The amino acid model of evolution
used for phylogenetic inference was LG, with a discrete
gamma distribution of among-site rate variation (four categories)
and a proportion of invariable sites.



Maximum likelihood analyses were performed using
RAxML version 7.2.6 (Stamatakis 2006). The best-tree topology
depicted in the figures was obtained by selecting the best
tree out of 100 replicates. Bootstrap support was obtained
using 100 bootstrap replicates of the same alignment.
Bayesian inference trees were inferred using PhyloBayes v3.3
(Lartillot et al. 2009). The resulting tree and posterior probabilities
were obtained once two parallel runs had converged (tracecomp
standard values), after at least 500,000
generations. The runs were sampled every 100 generations,
and the burn-in was established using a bpcomp
maxdiff < 0.3.
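
The convergence criterion can be illustrated with a short Python sketch of a maxdiff-style statistic: the largest difference in bipartition frequency between two runs after discarding the burn-in. This is a conceptual illustration rather than PhyloBayes code; representing each bipartition by one side of the split, the function names, and the toy samples are assumptions made here.

```python
def bipartition_frequencies(sampled_trees, burn_in):
    """Return {bipartition: frequency} over the trees kept after the burn-in.

    `sampled_trees` is a list with one set of bipartitions per sampled tree.
    """
    kept = sampled_trees[burn_in:]
    counts = {}
    for tree in kept:
        for bipartition in tree:
            counts[bipartition] = counts.get(bipartition, 0) + 1
    return {b: n / len(kept) for b, n in counts.items()}


def maxdiff(freqs_run1, freqs_run2):
    """Largest absolute difference in bipartition frequency between two runs."""
    bipartitions = set(freqs_run1) | set(freqs_run2)
    return max(
        (abs(freqs_run1.get(b, 0.0) - freqs_run2.get(b, 0.0)) for b in bipartitions),
        default=0.0,
    )


# Toy samples: each bipartition is represented by one side of the split.
AB = frozenset({"A", "B"})
CD = frozenset({"C", "D"})
run1 = [{AB}, {AB, CD}, {AB, CD}, {AB, CD}, {AB}]
run2 = [{AB}, {AB, CD}, {AB, CD}, {AB}, {AB}]

f1 = bipartition_frequencies(run1, burn_in=1)
f2 = bipartition_frequencies(run2, burn_in=1)
converged = maxdiff(f1, f2) < 0.3  # threshold stated in the text
```

With these toy samples the two runs agree on the AB split and differ by 0.25 on the CD split, so they would pass the 0.3 threshold.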

Results 

GPCR Families: Ancient Origins and Architecture 
Diversifications 

A widely accepted classification of the metazoan GPCR com- 
plement is the GRAFS system, which is based on both phylog- 
eny and structural similarity (Fredriksson et al. 2003; 
Fredriksson and Schioth 2005; Lagerstrom and Schioth 
2008; but see Pierce et al. [2002] for an alternative classifica- 
tion). The GRAFS system divides GPCRs into five different fam- 
ilies, Glutamate (also known as Class C), Rhodopsin (Class A), 
Adhesion (Class B), Secretin (Class B), and Frizzled (Class F).
This system can be extended to GPCR types described in
nonmetazoans, including the cAMP (Class E), ITR-like, and
GPR-108-like families, as well as several lineage-specific receptor
families such as insect odorant receptors, nematode chemoreceptors,
or vertebrate vomeronasal receptors (Nordstrom
et al. 2011). Fungi also have well-defined GPCR families such
as Ste2 and Ste3 (both included in Class D); Git3 and plant
Abscisic acid receptors are also thought to be GPCRs
(Plakidou-Dymock et al. 1998; Tuteja 2009; Krishnan et al.
2012). Most GPCR families are associated with a characteristic
PFAM domain (Fredriksson et al. 2003; Fredriksson and 
Schioth 2005; Lagerstrom and Schioth 2008). 

First, we assessed the presence and abundance of GPCR 
family domains in diverse eukaryotic genomes (see fig. 2 for a 
complete taxon sampling). Our data show that the distribution 
of GPCR families in eukaryotes follows two distinct evolution- 
ary patterns. Some families are pan-eukaryotic, whereas 
others are biased toward amorpheans (unikonts). For instance, 
GRAFS are more abundant in amorpheans, especially in meta- 
zoans, although some (Glutamate, Adhesion/Secretin, and 
Rhodopsin) are also observed in some bikonts. Other families, 
such as cAMP receptors, Git3, ITR-like, GPR-108-like, and
Abscisic acid receptors, are found in similar abundance 
among eukaryotes. Interestingly, non-GRAFS GPCR families 
are never expanded in any species (<10 members in all 
genomes). We also surveyed the taxonomically restricted
metazoan families, and although we found chemosensory receptors
(7tm_7) and the Serpentine type chemoreceptors Srw
and Srx in some previously unreported metazoan genomes,
none were observed in nonmetazoan eukaryotes (supplementary
figs. S1 and S2, Supplementary Material online), with the
exception of OA1 (Ocular Albinism receptor), which is specific
to metazoans and Capsaspora owczarzaki. These results indicate
that most GPCR families have ancient origins in the last
eukaryotic common ancestor.

Fig. 2. Distribution and abundance of GPCR signaling components in 78 eukaryotic genomes. Numbers and abundance of domain-containing proteins
are depicted according to the color legend in the upper left, with black indicating absence of the given domain in a given species. Yellow indicates low
abundance, whereas the scale toward purple indicates higher abundance. The various domains are grouped into the functional modules specified in figure 1, as shown in
the schema at the bottom right. Species marked with an asterisk are covered only by RNA-seq data; therefore, gene absence is not definitive. The original
numbers of the heatmap are available in supplementary table S1, Supplementary Material online.



Diversification of ancient GPCR families is usually accom- 
panied by architectural diversification of the N-terminal 
protein domain (Lagerstrom and Schioth 2008). Thus, we 
analyzed the architectural diversity of each GPCR family in 
each genome and observed two types of GPCRs in terms of 
N-terminal domain diversity (diversifying vs. nondiversifying in 
supplementary fig. S2, Supplementary Material online). Some,
such as Glutamate, Adhesion/Secretin, and, to a lesser extent,
Rhodopsin, are susceptible to the recruitment of new domains
in the N-terminal region, especially in Metazoa, whereas
others, such as cAMP, Git3, OA1, Abscisic acid receptors,
GPR-108-like, and ITR-like, have substantially lower diversity
of protein domains at the N-terminus. This result suggests
that some GPCR families have functional constraints, whereas
others are prone to diversify through recruitment of concurrent
domains.

Fig. 3. Conservation of the domain architecture of different GPCR signaling components across eukaryotic genomes. A black dot indicates the
presence of a given domain architecture. A white dot indicates a similar domain architecture (a tyrosine kinase instead of a serine/threonine kinase, in the case of
choanoflagellate GRK-like genes). For simplicity, only the most common architectures are shown. The percentage of genes found with a given architecture
within a family is indicated at the bottom of the table, as well as the total number of genes within the family. GPM in the first column of GoLoco motif
containing proteins stands for G Protein Modulator/Rapsynoid. The complete domain architectures of the GPCR signaling system components are found in
supplementary figures S3, S4, and S6, Supplementary Material online.

To gain further insights into domain diversification, we 
searched for evolutionary conservation of specific protein 
domain architectures (fig. 3) and found that some architec- 
tures are highly conserved across lineages. For example, 
Glutamate receptors (7tm_3) have protein domain configura- 
tions that are conserved in distant eukaryotic lineages, includ- 
ing those with Venus Flytrap module (ANF_receptor), OpuAC, 
or Bmp domains (fig. 3). Additionally, several nonmetazoan 
species have diversified their own species-specific configura- 
tions of glutamate receptors (supplementary fig. S3, 
Supplementary Material online). The Adhesion family is also 
quite structurally diverse, especially in metazoans and, to a
lesser extent, unicellular holozoans (supplementary fig. S3,
Supplementary Material online). Similarly, the Rhodopsin 
family is architecturally diversified, mainly in metazoans. 
Finally, Fz-Frizzled, RpkA (cAMP-PIP5K domain architecture),
and Git3-Git3_C protein domain architectures could be identified
in several eukaryotic genomes (supplementary fig. S4,
Supplementary Material online), extending the previously known
distribution of those architectures to the LECA or to the root of
Amorphea/Unikonta. Remarkably, most of the complex GPCR
architectures belong to GRAFS families and are mostly diversified
and conserved within metazoans.

Heterotrimeric G Protein Complex 

GPCRs typically signal through G proteins. In an inactive state, 
the three G protein subunits (alpha, beta, and gamma) form a 
heterotrimeric complex (Pierce et al. 2002; Oldham and 
Hamm 2008) (fig. 1). When a ligand activates a GPCR, the receptor
acts as a GEF, promoting GDP to GTP exchange in the G
alpha subunit. This exchange alters G alpha subunit confor- 
mation and promotes the disaggregation of the heterotrimeric 



610 Genome Biol. Evol. 6(3):606-619. doi:10.1093/gbe/evu038 Advance Access publication February 23, 2014 



Evolution of the GPCR Signaling System in Eukaryotes 



GBE 



complex. The active G alpha subunit and an active dimer of
beta and gamma subunits mediate further
downstream signaling through various effectors (Milligan
and Kostenis 2006; Oldham and Hamm 2008). G alpha is a 
low-efficiency GTPase, whereas G beta has various WD-40 
repeats (PF00400) and G gamma is a small protein containing 
a conserved domain (Milligan and Kostenis 2006; 
Anantharaman et al. 2011). 

Using the signature domains of each G protein, we sur- 
veyed our data set to find their general distribution patterns 
and found that the abundance of each subunit varies markedly
across eukaryotes and that some taxa have lost all
three subunits entirely (Anantharaman et al. 2011). G protein
alpha is the most susceptible to diversification, and, interestingly,
beta and gamma subunits have multiple copies in
G alpha-rich species. Although the combination of the three
elements is important for signaling plasticity, G alpha is the
most evolutionarily dynamic of the three G proteins.

To gain further insights into the evolution of G alpha pro- 
teins, we performed phylogenetic analyses using our eukary- 
otic data set (fig. 4), and the resulting tree shows that several 
groups have lineage-specific diversifications, such as those in 
Naegleria gruberi, Bigelowiella natans, and Emiliania huxleyi. 
The opisthokonts have a diverse but conserved repertoire of 
G alpha proteins. Fungi have four distinct paralogs (GPA-1 to
4) that are present in Ascomycota, Basidiomycota, Mucoromycotina,
and Chytridiomycetes (families reviewed in Li et al. 2007) and
were therefore most likely present in the fungal ancestor.
Holozoa also have four ancient paralogs, Gαs, Gαq/12/13,
Gαi/o, and Gαv (described for Metazoa in Oka et al. 2009).
It is worth mentioning that all the metazoan G alpha families 
are conserved in the unicellular relatives of Metazoa, indicat- 
ing that they originated prior to the diversification of meta- 
zoans from the rest of holozoans. 

We also identified a new and divergent family of holozoan 
G alpha subunits that branches out from the Opisthokonta 
clade, comprising Nematostella vectensis, Lottia gigantea, and 
other holozoans (fig. 4). Additionally, we observed a cluster of 
conserved G alpha subunits in several distant eukaryotic line- 
ages (what we call conserved-eukaryotic group I): 
Ichthyosporea, Allomyces macrogynus, and dictyostelids 
within the Amorphea, and B. natans and Ectocarpus siliculosus
within the bikonts. It is likely that this particular family origi- 
nated in the LECA and was lost many times during eukaryotic 
evolution. 

We also performed a phylogenetic analysis of eukaryotic
beta subunits to compare the evolutionary histories of alpha
and beta (supplementary fig. S5, Supplementary Material
online). Our tree shows that holozoans have a particular ancient
duplication, Gβ1-4 and Gβ5, with the more derived Gβ5
known to interact with G gamma-like subunits, such as RGS7
(Sondek and Siderovski 2001; Anderson et al. 2009), a
multidomain protein that contains a G gamma domain. We
identified RGS7 in both chytrid fungi and holozoans (fig. 3 and
supplementary fig. S6, Supplementary Material online), and
therefore, the ancient duplication of G protein beta and its
partner, RGS7, are ancient features of holozoans.

Regulatory Proteins: RGS and GoLoco 

Regulation of G proteins is a key step in GPCR signaling that 
involves two main protein families, RGS and GoLoco motif- 
containing proteins (Siderovski and Willard 2005; Wilkie and 
Kinch 2005). RGS proteins act as GAPs, accelerating the hydrolysis of GTP to GDP
and thereby promoting the re-formation of the G protein heterotrimeric
complex and terminating G alpha signaling (Siderovski
and Willard 2005). Nevertheless, not all RGS domains act as
GAPs in G protein signaling, and some have lost their
GAP activity and have developed scaffolding functions
(Anantharaman et al. 2011). GoLoco-motif-containing proteins
(also known as G protein regulators) act as guanine nucleotide
dissociation inhibitors, inhibiting the dissociation of the
heterotrimeric complex by binding to G alpha-GDP and block- 
ing downstream signal transduction (Siderovski and Willard 
2005). 

We traced the distribution and abundance of RGS and 
GoLoco motif proteins in eukaryotes and found that RGS is 
present in many different eukaryotes, mainly coinciding with 
the presence of heterotrimeric subunits (fig. 2). The number of 
RGS varies from one single copy in some taxa to numerous 
copies in other lineages. For example, some eukaryotes such 
as N. gruberi (229), B. natans (39), E. siliculosus (47), or the
ichthyosporeans (22-119) have more RGS proteins than 
Homo sapiens (34), whereas other multicellular lineages 
such as plants possess only one copy. In contrast, the 
GoLoco motif appears to be exclusive to metazoans and choa- 
noflagellates (figs. 2 and 3), and although its copy number 
may vary, it is less abundant than RGS. Therefore, our data 
show that the eukaryotic RGS system underwent independent 
radiations in lineages including amoebozoans, ichthyospor- 
eans, heteroloboseans, and rhizarians, whereas GoLoco is a 
later development that originated prior to the divergence of 
choanoflagellates and metazoans. 

We then examined the architectural configurations of RGS 
proteins, as they are known to combine with many other 
protein domains (Siderovski and Willard 2005; 
Anantharaman et al. 2011). Our survey shows that species 
with distant phylogenetic relationships to each other evolved 
their own architectural repertoires and generally have unique 
configurations that are not found elsewhere (supplementary 
fig. S6, Supplementary Material online). Moreover, many con- 
figurations evolved independently, recruiting the same 
domain in different configurations. For example, DEP, cNMP 
binding, Kinases, Rho GTPase, Leucine-Rich Repeat, START, 
and Ankyrin repeats are all present in various combinations 
in RGS genes from divergent taxa (shown in red in supple- 
mentary fig. S6, Supplementary Material online). However, 
some complex multidomain architectures are evolutionarily
conserved (fig. 3 and supplementary fig. S6, Supplementary
Material online). For example, opisthokonts share some 
common RGS architectures, namely Sorting Nexins (SNX13/ 
14/25) and the previously mentioned RGS7. Additionally, the 
RGS-like domain, typical of PDZ-RhoGEF, is an innovation of 
Holozoa (fig. 2), whereas RGS12 and Axin are metazoan in- 
novations (supplementary fig. S6, Supplementary Material 
online). Our results emphasize that metazoans and their unicellular
relatives have conserved elements of the RGS complement,
which is otherwise quite susceptible to diversification through
domain rearrangements.

Of specific interest are RGS proteins with TM domains 
(Anantharaman et al. 2011; Urano et al. 2012; Bradford 
et al. 2013), as they localize to the cell membrane next to 
heterotrimeric G proteins. We found that in most lineages,
RGS is fused to at least one TM domain (supplementary fig.
S7, Supplementary Material online), except in apusozoans,
amoebozoans, and haptophytes. In plants and other eukaryotes,
RGS domains have been observed together with 7TM organizations,
somewhat resembling a GPCR but with the opposite
effect on G proteins (Urano et al. 2012; Bradford et al. 2013).
Many bikonts possess 7TM-RGS architectures, but we found
that chytrid fungi, filastereans, and ichthyosporeans also have
this type of receptor, whereas metazoans do not, suggesting
that metazoans dispensed with GAP TM signaling and came
to rely on typical GPCR signaling.

GoLoco motifs are also found within multidomain
proteins. Our results show that choanoflagellates have a
unique configuration (SH2-GoLoco) and an architecture shared
with metazoans, G-protein-signaling modulator/Rapsynoid
(fig. 3 and supplementary fig. S6, Supplementary Material
online). Metazoans have some additional conserved architectures,
such as RGS12/RGS14 and Rap1GAP (supplementary
fig. S6, Supplementary Material online).

Upstream Alternative Regulators: Ric8 and Phosducin 

Ric8 is a long domain that acts as a GEF, activating G alpha 
subunits in the absence of GPCR signaling, or as a chaperone 
to stabilize G alpha (Hinrichs et al. 2012; Chan et al. 2013). 
Ric8-mediated activation of monomeric G alpha is involved in 
development and signaling in metazoans, fungi, and 
Dictyostelium (Hinrichs et al. 2012; Kataria et al. 2013). 
Although we found Ric8 in almost all amorpheans, suggesting 
it was secondarily lost in some species (Microsporidia, 
Thecamonas trahens, and Entamoeba histolytica) (fig. 2), it is 
rare in bikonts and found only in a small number of 
Heterokonta. The presence of Ric8 in only a few heterokonts 
could be explained by horizontal gene transfer, although our 
phylogenetic analysis does not support this hypothesis (sup- 
plementary fig. S8, Supplementary Material online), but sug- 
gests instead that Ric8 was present in the LECA and 
secondarily lost in many eukaryotic lineages. 



Phosducins belong to a small and ancient gene family, 
Phosducin-like (Blaauw et al. 2003; Willardson and Howlett 
2007), and act as cochaperones of the G beta/gamma dimers, 
allowing normal dimer configuration and transiently inhibiting 
their junction with G alpha (Willardson and Howlett 2007). 
We performed a phylogenetic analysis of Phosducin-like proteins,
and the resulting tree shows three major clades:
Phosducin I, Phosducin II/III, and orphan phosducin (supplementary
fig. S9, Supplementary Material online). The only
one known to interact with G protein beta subunits is the
Phosducin-I or Phosducin/PhLP1 clade (Blaauw et al. 2003),
and this is further supported by the fact that most species
that have Phosducin I proteins also possess the heterotrimeric
beta subunit. Conversely, the phosducin-II/III clade includes
chlorophyte sequences, a group that lacks G protein signaling.
This suggests that proteins belonging to the phosducin-II/III
clade have substrates other than G proteins (Willardson and 
Howlett 2007). 

Alternative Signaling Inputs: GRK, Arrestins, and ARDCs 

GPCRs can also signal independently of G proteins, which is 
mainly achieved through interactions with GRKs and Arrestins, 
where Arrestins can either antagonize G protein signaling or 
connect GPCRs to other signaling modules (Gurevich VV and
Gurevich EV 2006; Reiter and Lefkowitz 2006; DeWire et al. 
2007; Shenoy and Lefkowitz 2011). GRKs have an active 
kinase domain and an inactive RGS domain, which allows them
to scaffold with GPCRs. Similar to other kinases (e.g., PKC and 
PKA), GRKs phosphorylate active GPCR receptors in a process 
called desensitization, inhibiting the GPCR and allowing 
Arrestin binding. Arrestin binding promotes receptor internal- 
ization by endocytosis, which can result in ubiquitination or 
recycling of the GPCR (Pierce et al. 2002; Gurevich VV and 
Gurevich EV 2006; DeWire et al. 2007). Additionally, Arrestins 
can also act as adaptors for other signal transduction path- 
ways such as MAPK or Akt (DeWire et al. 2007). Thus, under- 
standing the evolutionary dynamics of Arrestin/GRK signaling 
is key to building a complete picture of GPCR signaling. 

We found that GRK-like proteins are present in a reduced 
subset of eukaryotes, including Holozoa, Dictyostelida, 
Heterokonta, and Haptophyta (Mushegian et al. 2012) (sup- 
plementary figs. S10 and S11, Supplementary Material 
online). Our phylogenetic analysis supports the duplication 
of GRKa and GRKb paralog groups at the root of Holozoa, 
as some sequences belonging to filastereans and ichthyospor- 
eans branch within the GRKa clade (supplementary fig. S10, 
Supplementary Material online). Nevertheless, when the
kinase domain is used to unravel the evolutionary history of GRKs,
some RGS-kinase architectures appear to be convergent: in
choanoflagellates and dictyostelids, RGS is fused to a tyrosine kinase-like
domain instead of an AGC kinase (supplementary
fig. S11, Supplementary Material online). Although the absence
of GRK in many GPCR-rich genomes is not surprising,
because other kinases can replicate this function, holozoans
retained two paralogs of this specialized kinase.

Although GRKs are rather scarce in eukaryotes, ARDCs are 
broadly distributed, and our survey shows that most eukary- 
otes have a variable number of ARDCs (fig. 3). To gain insights 
into the evolutionary history of Arrestins and ARDCs, we per- 
formed a phylogenetic analysis and identified three major 
clades, though with low nodal support (supplementary fig. 
S12, Supplementary Material online). One clade includes 
metazoan Arrestins, as well as several sequences from unicel- 
lular holozoans, making Arrestins a premetazoan invention. 
The tree also shows large lineage-specific expansions of
ARDCs in ciliophorans, in fungal clades dominated by
Mucoromycotina sequences, and in the metazoans
Caenorhabditis elegans, Drosophila melanogaster, and
Trichoplax adhaerens. Interestingly, both Arrestins and
ARDCs are known to interact with GPCRs (Alvarez 2008),
and therefore, their presence and expansion suggest a complementary
system to G protein signaling.

GPCR Signaling System 

After addressing the evolutionary histories of the various
components of GPCRs and their signaling modules, we analyzed
them at the system level by reducing the diversity of molecules
into the main functional categories and analyzing their
evolution (fig. 5). Our data show that holozoans, fungi,
amoebozoans, heterokonts/stramenopiles, haptophytes, rhizarians,
and heteroloboseans have most of the components of the
GPCR signaling system, whereas others, such as Giardia lamblia
and the microsporidians, are completely reduced. Other
lineages have retained only a subset of the components in- 
volved in GPCR signaling, which challenges general views on 
the basic mechanics of the system. First, Abscisic acid recep- 
tors (PF12430) and GPR-108-like (PF06814) are present in ge- 
nomes where most of the GPCR signaling system has been 
lost (such as Cyanidioschyzon merolae and Leishmania major, 
see fig. 2), which implies that their role as GPCRs is doubtful, 
as previously suggested (Maeda et al. 2008; Anantharaman
et al. 2011).

Furthermore, there are other taxa in which some GPCRs are 
present, even though the heterotrimeric complex is absent (or 
partially absent). For example, the apusozoan T. trahens, 
which lacks heterotrimeric subunits, has four cAMP receptors 
and one Adhesion receptor, all of which are canonical GPCRs. 
Similarly, ciliophorans, which only have the G protein subunit 
beta, have members of Rhodopsin, Adhesion, cAMP, and ITR- 
like receptors. Interestingly, both T. trahens and ciliophorans 
have ARDCs, in high numbers in the latter group, suggesting 
that ARDCs might provide an alternative link between GPCRs 
and other signal transduction pathways in those lineages. This
is not the case in Guillardia theta, however, which has cAMP 
and ITR-like GPCRs but neither G proteins nor ARDCs. All
these data suggest that GPCRs might be connected to
signaling modules other than G proteins.

The modularity of the GPCR signaling system is further 
supported by the fact that various G protein subunits can be 
found independently of the other subunits. For example, the 
G alpha subunit, but not the G beta and gamma subunits, is 
present in Trichomonas vaginalis and Cyanophora paradoxa. 
The former has 7TM-RGS proteins, which, in the absence of 
GPCR and two of the components of the heterotrimeric com- 
plex, may be interacting with other signaling pathways 
(Bradford et al. 2013), but no RGS is detected in C. paradoxa. 
Ciliophorans only have the G beta subunit but have several 
Phosducin-like genes, which may also imply that ciliophorans 
have co-opted Phosducin and G protein beta into a distinct 
function. Additionally, T. trahens has an RGS protein with no 
obvious function due to the absence of G alpha subunits. 
Thus, the evolutionary conservation of some components in 
simplified genomes underpins the modular plasticity of the 
GPCR signaling system. 
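
The presence/absence reasoning in the two preceding paragraphs can be made explicit with a small lookup. The following is only an illustrative sketch, not the procedure used in this study: the calls are transcribed from the text above, categories that the text does not mention for a taxon are deliberately left unrecorded, and the names PROFILES and taxa_where are hypothetical.

```python
# Presence (True) / absence (False) calls transcribed from the text above.
# A category that the text does not mention for a taxon is simply not recorded.
PROFILES = {
    "T. trahens": {"GPCR": True, "G_alpha": False, "G_beta": False,
                   "G_gamma": False, "RGS": True, "ARDC": True},
    "Ciliophora": {"GPCR": True, "G_alpha": False, "G_beta": True,
                   "G_gamma": False, "ARDC": True, "Phosducin": True},
    "G. theta": {"GPCR": True, "G_alpha": False, "G_beta": False,
                 "G_gamma": False, "ARDC": False},
    "T. vaginalis": {"GPCR": False, "G_alpha": True, "G_beta": False,
                     "G_gamma": False, "RGS": True},
    "C. paradoxa": {"G_alpha": True, "G_beta": False, "G_gamma": False,
                    "RGS": False},
}


def taxa_where(present=(), absent=()):
    """Return taxa with every `present` category recorded as True and
    every `absent` category recorded as False.

    Unrecorded categories never match, so a missing call is not
    silently treated as an absence.
    """
    hits = []
    for taxon, calls in PROFILES.items():
        has_all = all(calls.get(c) is True for c in present)
        lacks_all = all(calls.get(c) is False for c in absent)
        if has_all and lacks_all:
            hits.append(taxon)
    return hits


# Receptors retained without the G alpha subunit.
receptors_without_g_alpha = taxa_where(present=["GPCR"], absent=["G_alpha"])

# G alpha retained without its beta and gamma partners.
g_alpha_alone = taxa_where(present=["G_alpha"], absent=["G_beta", "G_gamma"])

# RGS retained without its usual G alpha target.
rgs_without_g_alpha = taxa_where(present=["RGS"], absent=["G_alpha"])
```

Each query returns a non-empty list, which is the sense in which the components behave as separable modules: receptors, individual G protein subunits, and regulators can each persist where their canonical partners are absent.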

We also performed a PCA of our eukaryote data set with
the aim of elucidating different evolutionary tendencies
(supplementary fig. S13, Supplementary Material online). We
observed at least three clusters among eukaryotes that illustrate
different patterns of evolution: expansion, simplification, and
conservation of the GPCR signaling system. Principal component 1
(PC1) is loaded mainly by the core functional categories of
the GPCR signaling system, clustering the most simplified taxa
together, including strict parasites such as microsporidians, 
G. lamblia, trypanosomatids, Perkinsus marinus, or apicom- 
plexans. Interestingly, many autotrophic lineages, such as 
Archaeplastida and Cryptophyta, also have a considerably 
reduced complement of GPCRs. On the other hand, PC2 
differentiates between the two kinds of diversification of the 
GPCR signaling system. In a cluster characterized by the loading
of G alpha and beta subunits, RGS, and cAMP receptors,
we find some ichthyosporeans (A. whisleri, P. gemmata, and
Am. parasiticum), N. gruberi, B. natans, and Al. macrogynus.
Metazoans are differentiated in PC2 by the presence of 7tm1,
7tm2, GoLoco, and Frizzled. Therefore, our data indicate that
the composition of the GPCR signaling system evolved repeat- 
edly toward a more complex pathway in various eukaryotic 
lineages. In particular, metazoans developed a more complex 
system through the expansion of GPCR signaling 
components. 
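
For readers unfamiliar with how scores and loadings arise in such an analysis, the following minimal sketch runs a PCA on a taxa-by-category count matrix. It is not the pipeline used for this study. The log transform, the taxon labels, and every count in TOY_COUNTS are invented placeholders for illustration only, and the function name pca is hypothetical.

```python
import numpy as np


def pca(counts, n_components=2):
    """PCA by singular value decomposition of a taxa x category matrix.

    Returns (scores, loadings, explained_ratio) for the first
    n_components principal components.
    """
    # Assumption: log(1 + x) damps very large gene counts; the
    # preprocessing used in the study is not specified here.
    x = np.log1p(np.asarray(counts, dtype=float))
    x = x - x.mean(axis=0)  # center every functional category
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # variance fraction per component
    scores = u * s                   # coordinates of each taxon
    return scores[:, :n_components], vt[:n_components], explained[:n_components]


# Placeholder labels and counts, NOT data from this study.
CATEGORIES = ["G_alpha", "G_beta", "RGS", "cAMP_receptor", "7tm1", "Frizzled"]
TAXA = ["reduced_1", "reduced_2", "unicellular_rich", "metazoan_like"]
TOY_COUNTS = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [12, 4, 9, 15, 1, 0],
    [10, 3, 8, 0, 60, 5],
]

scores, loadings, explained = pca(TOY_COUNTS)
```

Each row of loadings gives the weight of every functional category on one component, which is what "loaded by" refers to above, and taxa with identical profiles, such as the two fully reduced placeholders, receive identical scores and therefore fall in the same cluster.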

Reconstruction of GPCR Signaling Components in LECA 

We reconstructed the evolutionary histories of the various
modules throughout the eukaryotic branch of the tree of life
(fig. 2) using the amorphea-bikont root for eukaryotes
(Derelle and Lang 2012) and taking into account the topology
from the most recent phylogenomic studies (Brown et al.
2012; Burki et al. 2012; Torruella et al. 2012). Our data
show that most GPCR families are ancient and that some of 
the specific architectures of each family can be traced back to 
the eukaryotic ancestor. Therefore, the LECA already had a 
complex GPCR signaling system, as well as many other
diversified gene families (Derelle et al. 2007; Fritz-Laylin et al. 2010;
Wickstead et al. 2010; Grau-Bove et al. 2013). Most
interestingly, some complex GPCR architectures are conserved in
bikonts (B. natans being the major example), contradicting
the hypothesis that canonical GPCR signaling
through G proteins evolved in amorpheans (Bradford et al.
2013).

Discussion 

Our genomic survey and evolutionary reconstruction show 
that the LECA had a complex repertoire of GPCRs (fig. 6). 
Independent expansions of the GPCR signaling system oc- 
curred in some eukaryotic lineages, and, interestingly, most 
of the species that have these expansions are unicellular or 
colonial, such as B. natans, N. gruberi, and ichthyosporeans 
(supplementary fig. S13, Supplementary Material online). This
supports the view that unicellular lifestyles also require com- 
plex signaling machineries (Crespi 2001). In fact, multicellular 
fungi such as the Basidiomycota Coprinus cinereus and the 
Ascomycota Tuber melanosporum have rather simpler
complements of GPCRs than unicellular fungi, including chytrids
and Mucoromycotina. Similarly, embryophytes possess a
reduced GPCR signaling system. Of course, other signaling 
pathways are also present in eukaryotes, such as Histidine 
kinases, Serine/Threonine kinases, or Tyrosine Kinases 
(Anantharaman et al. 2007; Schaller et al. 2011; Suga et al.
2012), and these can have more important roles in the taxa 
where GPCR signaling is simplified. 

An important conclusion from our work is the modularity 
of the system. We find that some species have GPCRs without 
G proteins and vice versa, and we also show how different 
parts of the GPCR signaling system evolved independently, so 
that different functional categories involved in the pathway 
can become simplified without altering the others, as has 
been hinted at in other studies (Wilkie and Kinch 2005; 
Anantharaman et al. 2011). In addition, some parts of the 
pathway have diversified, both in terms of gene number 
and domain architecture, whereas other elements remain 
conservative. All this evidence suggests that the system is plas- 
tic and that drastic rearrangements can occur without com- 
plete loss of functionality. This robustness of eukaryotic 
signaling systems has been compared with the simpler and 
more direct signaling systems of prokaryotes (Anantharaman 
et al. 2007), and indeed modularity is a key feature of eukary- 
otic signaling pathways, which show great diversity of signal- 
ing machineries across different lineages (Anantharaman et al. 
2007; Schaller et al. 2011).

Modularity is not only observed in how the various elements
of the GPCR signaling pathway evolve but also at the
level of protein domain architectures. Overall, our results on 
domain architectures clearly show that domain shuffling is a 
major mechanism of signaling system evolution. Indeed,
pervasive convergent evolution of domain arrangements is a
major feature of both GPCRs and RGS proteins
(Nordstrom et al. 2009; Anantharaman et al. 2011;
Krishnan et al. 2012). However, because not all GPCR families
are equally susceptible to acquiring new domains, functional
constraints might also exist that prevent this evolutionary
mechanism of innovation.

A recent functional study of a subset of G alpha subunits
from various eukaryotes suggests that canonical GPCR
signaling is restricted to amorpheans (Bradford et al. 2013).
However, our results reveal some inconsistencies with that
view. For example, the presence of Ric8 in heterokonts
(including E. siliculosus, which was tested in that study) may
imply that in that lineage there is GEF activation of G protein
alpha subunits and not only "self-activation." Also, the
presence of both 7TM-RGS and canonical GPCRs in opisthokonts
(filastereans, ichthyosporeans, and early branching fungi) blurs
the distinction between GAP and GEF receptor-based G protein
signaling, as they coexist in some lineages. Furthermore, the
monophyly of lineage-specific G alpha protein clades implies
that each of those lineages diversified its own repertoire.
Thus, there is no conserved "self-activation" subfamily.

Instead, "self-activation" could have evolved as a convergent
character of G alpha subunits. Because only the activity
of a single G alpha paralog has been tested for
most lineages, it would be interesting to test more paralogs
to clarify whether self-activation is the only mechanism
in bikonts (Bradford et al. 2013). Finally, the presence of
many GPCR types with functionally known amorphean
domain architectures and rich heterotrimeric protein
complements in bikonts, such as Phytophthora infestans and
B. natans, suggests that they may have had canonical GPCR
signaling. Those species should be ideal for testing different G
alpha subunits experimentally. Overall, our results suggest
that canonical GPCR-G protein signaling is older than
previously hypothesized, most likely being already functional in
the LECA.

Irrespective of whether canonical GPCR signaling evolved at
the root of amorpheans or before, regarding the origin of
metazoans, our results show a bimodal pattern of evolution
of the elements of the GPCR signaling system. Cytoplasmic
transduction elements, such as G proteins, Ric-8, GoLoco
motif, Arrestins, and RGS families, are largely conserved
between unicellular holozoans and metazoans, both in terms of
gene families and protein domain architectures (fig. 7). In
contrast, receptors underwent a dramatic expansion in
metazoans compared with their closest unicellular relatives, and
a similar pattern has also been observed for tyrosine
kinases, Hippo signaling, and Notch signaling elements
(Gazave et al. 2009; Sebe-Pedros et al. 2012; Suga et al.
2012). The signaling output of GPCRs depends on the
combination of heterotrimeric G proteins and their regulators,
and, remarkably, the combination that originated in ancient
holozoans was already sufficient for transducing the huge
number of GPCR signaling inputs present in metazoans. The
expansion of receptors was probably driven by metazoan
multicellularity, which co-opted the GPCR signaling system for
many new functions, such as cell-cell communication,
developmental control, and, most importantly in the case of GPCRs,
complex environmental sensing, from light sensing to odor
and taste. We suggest that the shift from a universal
eukaryotic signaling system to a dramatic expansion and refinement
in metazoans played a key role in the acquisition of complex
multicellularity.